Abstract. The replicator equation is interpreted as a continuous
inference equation and a formal similarity between the discrete
replicator equation and Bayesian inference is described. Further
connections between inference and the replicator equation are given,
including a discussion of information divergences, evolutionary
stability, and exponential families as solutions for the replicator
dynamic, using Fisher information and information geometry.



1. Introduction 

To address the question of how natural selection is related to
information theory and statistical inference, we draw an analogy
between Bayesian inference and models of natural selection available in
evolutionary game theory. Exploring this analogy requires information
theory, which leads to information geometry and to the geometric
structure that information geometry shares with evolutionary game
theory. Recognition of this framework leads to generalizations of
Bayesian inference, which are explored in future work.

Bayesian inference and the discrete replicator equation share a
remarkable formal similarity. The continuous replicator dynamic can be
analyzed using some of the same techniques used in Bayesian inference,
such as the Kullback-Leibler divergence and exponential families. This
paper describes these similarities and explains how information
geometry illuminates the connection.

As a model of natural selection, the replicator dynamic models the
informatic behavior of the population distribution. Emerging purely
from information-theoretical constructions, the replicator dynamic
requires no biological assumptions. This is because the geometry of
evolutionary game theory comes from the information geometry of
manifolds of probability distributions. The theoretical results
associated to this geometry, such as Fisher's fundamental theorem, are
facts describing the evolution of information captured by replicating
systems.

1.1. Bayesian Inference. 

Inductive inference is the only process known to us by 
which essentially new knowledge comes into the world. 
- R. A. Fisher, The Design of Experiments (1935) [8] 

Bayesian inference is a discrete dynamical system utilizing Bayes'
theorem for iterative dynamic inference. It is widely used in machine
learning, e.g. in spam filtering and document classification. Define
the process as follows. Consider a collection of events
$H_1, H_2, \ldots, H_n$, along with Bayes' theorem

$$P(H_i|E) = \frac{P(E|H_i) P(H_i)}{P(E)} \quad \text{for } i = 1, 2, \ldots, n,$$

where

(1) the events constitute the entire state space:
$\sum_{i=1}^{n} P(H_i) = 1$,

(2) $P(H_i)$ is the prior probability of $H_i$,

(3) $E$ is an event corresponding to newly encountered evidence
and $P(E)$ is the marginal probability of $E$, where
$P(E) = \sum_{i=1}^{n} P(E|H_i) P(H_i)$,

(4) $P(H_i|E)$ is the posterior probability of $H_i$ given the evidence
$E$.

The process adjusts the probabilities of the events $H_1, \ldots, H_n$ in
light of the evidence provided by the observation $E$, forming a dynamic
process

$$(P(H_1), \ldots, P(H_n)) \to (P(H_1|E), \ldots, P(H_n|E)),$$

which can be iterated over a sequence of observations $E_1, E_2, \ldots$.
The Kullback-Leibler information divergence $D_{KL}(P(H|E) \| P(H))$ is
used to measure the gain in information from passing to the posterior
distribution.
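
As a minimal sketch of one step of this process, the following Python
fragment computes a posterior from a prior and a likelihood vector and
then measures the information gain with the Kullback-Leibler divergence.
The function names and the numerical values of the prior and likelihood
are illustrative assumptions, not taken from the text.

```python
import math

def bayes_update(prior, likelihood):
    """One Bayesian update: posterior_i = P(E|H_i) P(H_i) / P(E)."""
    evidence = sum(l * p for l, p in zip(likelihood, prior))  # marginal P(E)
    return [l * p / evidence for l, p in zip(likelihood, prior)]

def kl_divergence(p, q):
    """D_KL(p || q) in nats; terms with p_i = 0 contribute zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

prior = [0.5, 0.3, 0.2]        # hypothetical P(H_i)
likelihood = [0.9, 0.5, 0.1]   # hypothetical P(E | H_i)

posterior = bayes_update(prior, likelihood)
gain = kl_divergence(posterior, prior)  # information gained from E
```

Iterating `bayes_update` over a sequence of likelihood vectors gives the
discrete dynamical system described above.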

1.2. The Discrete Replicator Dynamic. 

The theory of evolution by cumulative natural selection 
is the only theory we know of that is in principle capable 
of explaining the existence of organized complexity. 
- Richard Dawkins, The Blind Watchmaker (1987) [7]

Consider a population of $n$ types of replicating entities, such as
genotypes (e.g. possible viral genetic sequences) or phenotypes (e.g.
eye color or investment strategies). Let $x_i$ be the proportion of the
population of the $i$th type and denote the population distribution
$x = (x_1, \ldots, x_n)$. The discrete replicator dynamic [6] is:

$$x_i' = \frac{x_i f_i(x)}{\bar{f}(x)} \quad \text{for } i = 1, 2, \ldots, n,$$

where

(1) the types completely describe the population so that
$\sum_i x_i = 1$ (i.e. the set of all possible states is the simplex),

(2) $f_i(x)$ is the fitness of type $i$ (dependent on the population
distribution), and $f = (f_1, \ldots, f_n)$ is the fitness landscape,

(3) $\bar{f}(x) = \sum_{i=1}^{n} x_i f_i(x)$ is the average fitness, and

(4) $x_i'$ is the frequency of type $i$ in the next generation of the
population, adjusted by the proportionality of fitness relative to the
average population fitness.
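
The update rule can be sketched in a few lines of Python. The linear
landscape $f(x) = Ax$ and the particular payoff matrix below are
assumptions chosen only for illustration; any strictly positive fitness
function could be passed in instead.

```python
def replicator_step(x, fitness):
    """One generation of the discrete replicator: x_i' = x_i f_i(x) / mean fitness."""
    f = fitness(x)
    mean_f = sum(xi * fi for xi, fi in zip(x, f))
    return [xi * fi / mean_f for xi, fi in zip(x, f)]

def linear_fitness(x):
    # Hypothetical payoff matrix A; f_i(x) = sum_j A[i][j] * x_j
    A = [[1.0, 2.0], [2.0, 1.0]]
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

x = [0.2, 0.8]
x_next = replicator_step(x, linear_fitness)
```

Because each component is divided by the mean fitness, the new state
again sums to one and so remains on the simplex.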

A population of a particular distribution obtains information about
the environment from the fitness landscape. Replication adjusts the
distribution of types in the population as dictated by the relative
fitness as measured by the fitness landscape. In light of the fitness
landscape, some types proliferate while others decline, much as the
probability of a particular event is adjusted by Bayes' theorem in light
of new evidence in Bayesian inference.

2. Formal Similarity of the Discrete Replicator Dynamic and Bayesian Inference

The following dictionary describes the formal analogy of Bayesian
inference and the discrete replicator dynamic. This analogy was
independently discovered in [15].



Bayesian Inference                                       Discrete Replicator

Prior distribution $(P(H_1), \ldots, P(H_n))$            Population state $x = (x_1, \ldots, x_n)$

New evidence $P(E|H_i)$                                  Fitness landscape $f_i(x)$

Normalization $P(E)$                                     Mean fitness $\bar{f}(x)$

Posterior distribution $(P(H_1|E), \ldots, P(H_n|E))$    Population state $x' = (x_1', \ldots, x_n')$

The fitness landscape provides the observation (evidence) in the
inference process and the population re-aligns proportionally. Bayesian
inference is a special case, formally, of the discrete replicator
dynamic, since the fitness landscape in each coordinate may depend on
the entire population distribution rather than only on the proportion
of the $i$-th type, which is significant if $n > 2$. In the above
formulation of Bayesian inference there is no explicit dependence of
the probability $P(H_i|E)$ on any event $H_j$ with $i \neq j$.
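
The dictionary can be checked numerically. In the sketch below the
fitness landscape is taken to be the constant vector of likelihoods
$P(E|H_i)$, so one replicator step reproduces one Bayesian update. The
helper names and the numerical values are illustrative assumptions.

```python
def bayes_update(prior, likelihood):
    """Posterior via Bayes' theorem, normalized by the marginal P(E)."""
    evidence = sum(l * p for l, p in zip(likelihood, prior))
    return [l * p / evidence for l, p in zip(likelihood, prior)]

def replicator_step(x, fitness):
    """Discrete replicator step, normalized by the mean fitness."""
    f = fitness(x)
    mean_f = sum(xi * fi for xi, fi in zip(x, f))
    return [xi * fi / mean_f for xi, fi in zip(x, f)]

prior = [0.5, 0.3, 0.2]        # hypothetical P(H_i), read as a population state
likelihood = [0.9, 0.5, 0.1]   # hypothetical P(E | H_i), read as fitness values

posterior = bayes_update(prior, likelihood)
# Fitness landscape f_i(x) := P(E | H_i), independent of the state x
next_state = replicator_step(prior, lambda x: likelihood)
```

Here `posterior` and `next_state` agree componentwise, with $P(E)$
playing the role of the mean fitness.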

Although the discrete replicator dynamic is intuitively satisfying,
the nonlinearity of the discrete dynamic makes analysis difficult. A
continuous version of the replicator equation is widely used and, as a
differential equation, is more amenable to analysis.

2.1. The Continuous Replicator Dynamic. The replicator dynamic
has an intuitive motivation as a general model of natural selection.
Suppose a function $f$ from the simplex to $\mathbb{R}^n$ describes the
fitness of the population states (or mixed strategies), with the $i$-th
component function $f_i$ giving the fitness of the $i$-th type,
depending on the entire population distribution. The intuitive idea of
natural selection, translated from Dawkins' Universal Darwinism, is that
the relative rate of change of the proportion of the $i$-th type should
be given by the difference of the fitness of the $i$-th type and the
mean population fitness. In equations,

$$\frac{\dot{x}_i}{x_i} = f_i(x) - \sum_{j=1}^{n} x_j f_j(x) = f_i(x) - \bar{f}(x),$$

where $\bar{f}(x)$ denotes the mean of $f(x)$. After rearrangement, the
replicator equation takes the form

$$\dot{x}_i = x_i (f_i(x) - \bar{f}(x)).$$

The continuous analog can be obtained by a limiting process from
the discrete dynamic [6], followed by a change in velocity that does not
alter the trajectories [10] and a possible gauge transformation. Indeed,
defining a differential equation from the difference equation,

$$\frac{x_i(t+h) - x_i(t)}{h} \xrightarrow[h \to 0]{} \dot{x}_i = x_i \frac{f_i(x)}{\bar{f}(x)} - x_i = x_i \frac{f_i(x) - \bar{f}(x)}{\bar{f}(x)},$$

which is equivalent to the following equation after a change in velocity
because $\bar{f}(x)$ can be assumed to be strictly positive:

$$\dot{x}_i = x_i (f_i(x) - \bar{f}(x)).$$
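
To make the continuous dynamic concrete, here is a minimal sketch that
integrates the replicator equation with forward Euler steps. The
two-type landscape $f(x) = (x_2, x_1)$, the step size, and the number of
steps are assumptions chosen for illustration; a production integrator
would use an adaptive ODE solver.

```python
def replicator_field(x, fitness):
    """Right-hand side of the replicator equation: x_i (f_i(x) - mean fitness)."""
    f = fitness(x)
    mean_f = sum(xi * fi for xi, fi in zip(x, f))
    return [xi * (fi - mean_f) for xi, fi in zip(x, f)]

def integrate(x0, fitness, dt=0.01, steps=2000):
    """Forward Euler integration of the replicator equation."""
    x = list(x0)
    for _ in range(steps):
        dx = replicator_field(x, fitness)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x

def fitness(x):
    # Hypothetical landscape: each type does well when the other is common
    return [x[1], x[0]]

x_final = integrate([0.2, 0.8], fitness)
```

The components of the vector field sum to zero whenever the state sums
to one, so the Euler iterates stay on the simplex up to rounding. For
this landscape the trajectory approaches the interior rest point
$(1/2, 1/2)$.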

This description, in light of the relationship to inference, identifies 
the replicator dynamic as a continuous inference process. Since infor- 
mation divergence plays an important role in Bayesian inference, it is 
natural to study its uses for the replicator dynamic. First we must 
define evolutionary stability. 

3. Evolutionary Stability 

A central question in evolutionary game theory is evolutionary
stability [13][16]. An evolutionarily stable state (ESS) of the
replicator dynamic is a population distribution that is robust to
invasion by mutant types. Formally, a distribution $\hat{x}$ on the
simplex is called an evolutionarily stable state of the replicator
dynamic if $\hat{x} \cdot f(x) > x \cdot f(x)$ (for all
$x \neq \hat{x}$ in some neighborhood of $\hat{x}$). This means that
$\hat{x}$ is a better reply to each neighboring strategy $x$ than $x$ is
to itself, and hence robust under the action of selection to the
invasion of nearby mutant strategies. Evolutionarily stable states are
asymptotic rest points of the replicator dynamic and correspond to the
concept of strong stability in dynamical systems [6][10].

THE REPLICATOR EQUATION AS AN INFERENCE DYNAMIC

3.1. Kullback-Leibler Divergence is a Lyapunov function for
the Replicator Dynamic. The following theorem shows that the
Kullback-Leibler information divergence forms a Lyapunov function for
the replicator dynamic, given an evolutionarily stable state. In fact,
evolutionary stability is characterized by this property. A version of
this theorem was proved in [1] and in [3]. A similar result is proven
in [10], with the Lyapunov function $V(x) = \prod_i x_i^{\hat{x}_i}$.

Theorem 1. The state $\hat{x}$ is an interior ESS for the replicator
dynamic if and only if $D_{KL}(\hat{x} \| x)$ is a local Lyapunov
function.

Proof. Let $V(x) = D_{KL}(\hat{x} \| x) = \sum_i \hat{x}_i \log \hat{x}_i - \sum_i \hat{x}_i \log x_i$.
Then we have that

$$\begin{aligned}
\dot{V}(x) &= -\sum_i \hat{x}_i \frac{\dot{x}_i}{x_i} = -\sum_i \hat{x}_i (f_i(x) - \bar{f}(x)) \\
&= -\sum_i \hat{x}_i f_i(x) + \sum_i \hat{x}_i \bar{f}(x) = -\sum_i \hat{x}_i f_i(x) + \bar{f}(x) \Big( \sum_i \hat{x}_i \Big) \\
&= -\sum_i \hat{x}_i f_i(x) + \bar{f}(x) = -(\hat{x} \cdot f(x) - x \cdot f(x)) < 0.
\end{aligned}$$

The last inequality holds if and only if $\hat{x}$ is an ESS. Finally,
by Jensen's inequality, $D_{KL}$ is minimized when $x = \hat{x}$, so it
is a local Lyapunov function. □
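
The theorem can be illustrated numerically. The sketch below assumes the
two-type landscape $f(x) = (x_2, x_1)$, for which $\hat{x} = (1/2, 1/2)$
satisfies the ESS inequality, and records $D_{KL}(\hat{x} \| x)$ along a
forward Euler trajectory. The landscape, starting point, and step size
are illustrative assumptions.

```python
import math

def kl(p, q):
    """D_KL(p || q) for strictly positive distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def replicator_field(x):
    # Hypothetical landscape f(x) = (x_2, x_1); interior ESS at (1/2, 1/2)
    f = [x[1], x[0]]
    mean_f = sum(xi * fi for xi, fi in zip(x, f))
    return [xi * (fi - mean_f) for xi, fi in zip(x, f)]

x_hat = [0.5, 0.5]
x = [0.2, 0.8]
dt = 0.01
potential = [kl(x_hat, x)]
for _ in range(1000):
    dx = replicator_field(x)
    x = [xi + dt * dxi for xi, dxi in zip(x, dx)]  # forward Euler step
    potential.append(kl(x_hat, x))
```

In this example the recorded values in `potential` decrease at every
step, in line with $\dot{V}(x) < 0$ away from $\hat{x}$.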

Viewing the state $\hat{x}$ as the final or equilibrium distribution of
the dynamic (analogous to the "true distribution" in an inference
context), we can interpret the quantity $D_{KL}(\hat{x} \| x)$ as the
potential information of the dynamic system. As the system converges,
the potential information decreases and is minimized because it is a
Lyapunov function.

This result is the continuous analog to the use of the Kullback-Leibler
divergence as a measure of information gain of Bayesian inference.
Within the neighborhood of the ESS, the system is minimizing the
information divergence between the current population distribution and
that of the selectively stable configuration; that is, it is minimizing
the potential information in the system. The theorem shows that this is
an informatic characterization of evolutionary stability.

3.2. Potential Information and the Discrete Replicator Dynamic.
The potential information $D_{KL}(\hat{x} \| x)$ plays an analogous role
for the discrete replicator dynamic.

Theorem 2. Suppose that the fitness landscape is strictly positive, that
is $f_i(x) > 0$ for all $i$ and $x$. If the population distribution
unfolds according to the discrete replicator dynamic then $\hat{x}$ is
an interior ESS if and only if the potential information is decreasing
along iterations of the dynamic.

Proof. First note that the ESS condition can be equivalently stated as

$$\frac{\hat{x} \cdot f(x)}{x \cdot f(x)} > 1$$

for all $x$ in a neighborhood of $\hat{x}$ (using the assumption that
the fitness landscape is strictly positive).

Consider the difference in potential information of two successive
states $P = D_{KL}(\hat{x} \| x') - D_{KL}(\hat{x} \| x)$. Assume that
$x$ is in the ESS neighborhood of $\hat{x}$. Then,




where the log is moved outside the sum using Jensen's inequality and 
the logarithm in the last line is positive by the ESS condition. □ 

The proof shows that the state $\hat{x}$ is an ESS if and only if the
potential information is decreasing along iterations of the dynamic,
once again giving a characterization of evolutionary stability.
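
A numerical illustration of this characterization follows. Since
$x_i'/x_i = f_i(x)/\bar{f}(x)$, the difference $P$ equals
$-\sum_i \hat{x}_i \log(f_i(x)/\bar{f}(x))$, which the sketch evaluates
along iterations. The strictly positive payoff matrix, for which
$\hat{x} = (1/2, 1/2)$ satisfies the ESS inequality, and the starting
point are assumptions chosen for illustration.

```python
import math

def kl(p, q):
    """D_KL(p || q) for strictly positive distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def fitness(x):
    # Hypothetical strictly positive landscape f(x) = A x
    A = [[1.0, 2.0], [2.0, 1.0]]
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def replicator_step(x):
    f = fitness(x)
    mean_f = sum(xi * fi for xi, fi in zip(x, f))
    return [xi * fi / mean_f for xi, fi in zip(x, f)]

x_hat = [0.5, 0.5]
x = [0.2, 0.8]
changes = []  # P = D_KL(x_hat || x') - D_KL(x_hat || x) at each iteration
for _ in range(20):
    x_next = replicator_step(x)
    changes.append(kl(x_hat, x_next) - kl(x_hat, x))
    x = x_next
```

Every entry of `changes` is negative in this example, so the potential
information decreases along the iterations while the state approaches
$\hat{x}$.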
3.3. Exponential Families: Solutions of the Continuous Replicator
Equation. In Bayesian inference, an exponential family produces a
conjugate prior which is also an exponential family, possibly of the
same type. The analogous property for the continuous inference
equation, the replicator dynamic, is to be of the form of an exponential
family at each point on the trajectory. Exponential families are
maximal entropy distributions [12], a property that corresponds with the
intuitive explanation of the action of natural selection from the
introduction. Define an exponential family to be a collection of
distributions of the form

$$p(x; \theta) = \exp\Big( C(x) + \sum_i \theta_i F_i(x) - \psi(\theta) \Big),$$

for functions $F_i$, $C$, and $\psi$ and parameter vector $\theta$. These
are maximal entropy distributions with respect to constraints of the
form $E[F_i(x)] = X_i$ and can be derived with Lagrange multipliers.

The solutions of the replicator equation can be realized as exponential
families [2][5][11]. Let $x_i = \exp(v_i - G)$ with $\dot{v}_i = f_i(x)$
and $G(x)$ a normalization constant to ensure that the distribution sums
to one. From the fact that $\sum_i x_i = 1$, $0 = \sum_i \dot{x}_i$ and
so

$$\begin{aligned}
0 = \sum_i \dot{x}_i &= \sum_i \exp(v_i(x) - G(x)) (\dot{v}_i(x) - \dot{G}(x)) \\
&= \sum_i x_i f_i(x) - \dot{G}(x) \sum_i x_i \\
&= \bar{f}(x) - \dot{G}(x).
\end{aligned}$$

Hence $\dot{G} = \bar{f}(x)$. Now $x_i$ satisfies

$$\dot{x}_i = \exp(v_i(x) - G(x)) (\dot{v}_i(x) - \dot{G}(x)) = x_i (f_i(x) - \bar{f}(x)),$$

which is the replicator equation. In the case of a log-linear fitness
landscape, explicit solutions can be derived [5]. In this case, the
equation for the variable $v$ can be reduced to a linear differential
equation, which can be solved with eigenvalue methods.
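
The construction can be checked in the simplest case, a sketch under the
assumption of a frequency-independent landscape $f_i(x) = c_i$. Then
$v_i(t) = \log x_i(0) + c_i t$ solves $\dot{v}_i = f_i$, and the
normalized exponential $x_i = \exp(v_i - G)$ should satisfy the
replicator equation. The constants, initial state, and finite-difference
step below are illustrative choices.

```python
import math

c = [1.0, 0.5, 0.2]    # hypothetical constant fitness values f_i(x) = c_i
x0 = [0.2, 0.3, 0.5]   # hypothetical initial population state

def solution(t):
    """Exponential-family form x_i = exp(v_i - G) with v_i' = c_i."""
    v = [math.log(x0i) + ci * t for x0i, ci in zip(x0, c)]
    G = math.log(sum(math.exp(vi) for vi in v))  # normalizer so sum(x) = 1
    return [math.exp(vi - G) for vi in v]

def replicator_field(x):
    """x_i (f_i - mean fitness) for the constant landscape c."""
    mean_f = sum(xi * ci for xi, ci in zip(x, c))
    return [xi * (ci - mean_f) for xi, ci in zip(x, c)]

t, h = 1.0, 1e-5
x = solution(t)
# Central finite difference approximating dx/dt at time t
finite_diff = [(a - b) / (2 * h) for a, b in zip(solution(t + h), solution(t - h))]
field = replicator_field(x)
max_error = max(abs(d - f) for d, f in zip(finite_diff, field))
```

The time derivative of the exponential-family curve matches the
replicator vector field up to finite-difference error, and the
normalizer $G$ keeps the state on the simplex.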

An intuitive discussion of the above result is worthwhile. Entropy
is a measure of the extent to which the distribution has "spread out"
over the landscape. Natural selection acts to fit the available niches
in a fitness landscape, arriving at the maximal entropy distribution
allowed by the constraints of the landscape. In the absence of
variation within the fitness landscape, the replicator dynamic is
stable. In fact, if $f_i(x) = c$ for all $i$ and all $x$ then any $x$ is
stationary (and a Nash equilibrium), and the solution with maximal
entropy distribution is the uniform distribution, which follows directly
from the definition of the exponential family. In the case of a variable
fitness landscape, natural selection re-aligns the population
distribution to fill out the landscape if not at equilibrium, at each
point taking the maximal entropy distribution available within the
constraints of the values of the $f_i(x)$.

4. Understanding the Connection

The connections between inference and evolutionary game theory
are not just formal coincidence. Information geometry explains the
commonality.

4.1. Information Geometry. The set of categorical distributions on
$n$ variables forms a Riemannian manifold [1] via the Fisher information
metric

$$g_{ij}(x) = \frac{\delta_{ij}}{x_i}.$$

The exponential map of this manifold gives the exponential families
of the previous section. In information geometry, the replicator
equation is known as the natural gradient. The gradient flow with
respect to the Fisher information metric is the replicator equation.
The Fisher information metric can be obtained from the Hessian of the
Kullback-Leibler information divergence, localizing the asymmetric
information divergence to the symmetric Fisher information. The metric
is known in evolutionary game theory as the Shahshahani metric and the
manifold is identified with its embedding into the reals as the
$(n-1)$-dimensional simplex. For explicit details see 0.

In this context, the interpretation of the replicator equation as a
continuous inference equation is natural, and the properties of the
dynamic with respect to the information divergence and formal solutions
as exponential families are less surprising. The replicator equation can
now be understood as modeling the informational dynamics of the
population distribution, moving in the direction of maximal local
increase of potential with respect to the Fisher information, and
ultimately converging to a minimal potential information state if a
stabilizing state (ESS) exists in the interior of the state space.

The conceptual situation is similar to that of the Price equation,
a statistical relationship that models evolutionary processes but itself
relies on no biological assumptions. Indeed, the Price equation is
equivalent to the replicator equation [13]. Similarly, the Shahshahani
geometry of evolutionary game theory has a purely mathematical origin
from information theory. This means that Kimura's maximum principle and
Fisher's fundamental theorem are statements of mathematical and sta- 
tistical facts that happen to model evolutionary processes rather than 
facts about natural selection itself. 
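
The gradient statement of Section 4.1 can be illustrated with a small
check. It is a sketch under the assumption that the fitness landscape is
the Euclidean gradient of a potential, here $V(x) = \frac{1}{2} x^T A x$
with a hypothetical symmetric matrix $A$, and that the metric on the
simplex is $\langle u, w \rangle_x = \sum_i u_i w_i / x_i$. A vector $u$
is the gradient of $V$ in this metric when $\langle u, w \rangle_x$
equals the directional derivative of $V$ along every tangent vector $w$
(components summing to zero).

```python
def fitness(x):
    # Euclidean gradient of the hypothetical potential V(x) = 0.5 * x^T A x
    A = [[1.0, 2.0, 0.0], [2.0, 1.0, 1.0], [0.0, 1.0, 3.0]]  # symmetric
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def replicator_field(x):
    f = fitness(x)
    mean_f = sum(xi * fi for xi, fi in zip(x, f))
    return [xi * (fi - mean_f) for xi, fi in zip(x, f)]

def fisher_inner(x, u, w):
    """Inner product <u, w> = sum_i u_i w_i / x_i at the point x."""
    return sum(ui * wi / xi for xi, ui, wi in zip(x, u, w))

x = [0.2, 0.3, 0.5]
w = [0.1, -0.3, 0.2]           # tangent vector: components sum to zero
u = replicator_field(x)

lhs = fisher_inner(x, u, w)                          # <u, w> in the metric
rhs = sum(fi * wi for fi, wi in zip(fitness(x), w))  # directional derivative of V
```

The two quantities agree, and the components of `u` sum to zero, so the
replicator vector field is tangent to the simplex and acts as the
gradient of $V$ with respect to this metric.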

5. Discussion 

The replicator dynamic is a continuous inference dynamic. It is a 
process guided by the geometry of Fisher information. As a model 
of natural selection, the replicator equation captures the informatic 
change associated to the population distribution. The formal analogy 
of Bayesian inference and the discrete replicator dynamic leads to the 
interpretation and use of information theoretic quantities in evolution- 
ary game theory. 

In particular, the Kullback-Leibler information divergence can be
used to define a potential information for replicator dynamics, both
discrete and continuous, given an evolutionarily stable state. This
quantity is minimized by the action of the dynamic and characterizes
evolutionary stability. The concept of exponential families gives formal
solutions to the continuous replicator dynamic.